Workflow and Best Practices
Conventional algorithms are designed to answer exactly one question. In contrast, deep models based on convolutional neural networks (CNNs) can be taught to "learn" patterns and can be adapted to a wide variety of problems and tasks, such as denoising, super-resolution, and segmentation. Data, which is used to teach a deep model how to learn and apply those concepts, is the most crucial ingredient that makes training possible. As shown in the following flowchart, three different data sets are used for training, fine-tuning, and testing: the training set, the validation set, and the test set.
Deep learning workflow
Training set… Is the portion of the data used to train the model. The model learns from this data how to apply concepts such as denoising, super-resolution, and semantic segmentation and to produce results. Training sets include the training input(s) and output(s), as well as optional mask(s). If training is going well, every iteration updates the weights of the network so that it becomes better and better at predicting on the training set.
Test set… Is used only after training is complete to assess the performance of the final model.
Validation set… Is a subset of the training data that provides an unbiased evaluation of the model during training. The validation set is used to detect and avoid overfitting.
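The sketch below illustrates how these three sets are typically used: weights are updated on the training set, the validation set is monitored to catch overfitting, and the test set is held back for a single final evaluation. This is only a conceptual illustration written with PyTorch on toy data with a toy denoising model; it does not reflect how the Deep Learning Tool is implemented, and the split ratios, model, and hyperparameters are arbitrary assumptions.

```python
import torch
from torch import nn
from torch.utils.data import TensorDataset, DataLoader, random_split

# Toy stand-in for real image data: 1000 clean/noisy pairs of 1-channel 32x32 images.
clean = torch.rand(1000, 1, 32, 32)
noisy = clean + 0.1 * torch.randn_like(clean)
dataset = TensorDataset(noisy, clean)          # model input is the noisy image, target is the clean one

# Split into training, validation, and test sets (70/15/15 is an arbitrary choice).
n_train = int(0.70 * len(dataset))
n_val = int(0.15 * len(dataset))
n_test = len(dataset) - n_train - n_val
train_set, val_set, test_set = random_split(dataset, [n_train, n_val, n_test])

train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=32)
test_loader = DataLoader(test_set, batch_size=32)

# A deliberately tiny CNN denoiser; a real model would be much deeper.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

best_val_loss = float("inf")
for epoch in range(5):
    model.train()
    for x, y in train_loader:                  # training set: this is where weights are updated
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():                      # validation set: monitored to detect overfitting
        val_loss = sum(loss_fn(model(x), y).item() for x, y in val_loader) / len(val_loader)
    if val_loss < best_val_loss:
        best_val_loss = val_loss               # in practice, save a checkpoint of the best model here
    print(f"epoch {epoch}: validation loss = {val_loss:.4f}")

# Test set: used only once, after training is finished, to report final performance.
model.eval()
with torch.no_grad():
    test_loss = sum(loss_fn(model(x), y).item() for x, y in test_loader) / len(test_loader)
print(f"final test loss = {test_loss:.4f}")
```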
The following 'best practices' can help you get the most from your project:
- Explore whether your project is better suited to classical machine learning or deep learning.
- Make your data suitable for model training (see Data Preparation). For example, using calibrated datasets is recommended for developing 'universal' models.
- The quality of your training sets determines the performance of the predictive model. This means you should have a strategy for continuously improving your training set for as long as better model accuracy provides any benefit. You may also find that you lack the data required to implement a deep learning solution. In this case, the Deep Learning Tool can artificially augment the available data when you set the model training parameters (see Data Augmentation Settings and the sketch after this list).
- Find the best balance between training progress and training time. For example, with limited computational power this might mean training models with 2D inputs on multiple diverse training datasets instead of using multi-slice or 3D inputs.
- It is often advantageous to start with a model that has been pre-trained (see Pre-Trained Models).
- If you are unsure where to start, use the Segmentation Wizard instead of the Deep Learning Tool or the Machine Learning Segmentation module (see Segmentation Wizard).
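As a rough illustration of what data augmentation does, the sketch below generates extra training pairs from a single labeled image by applying random flips, 90-degree rotations, and mild intensity jitter. This is only a conceptual example written with PyTorch; the function name, the specific transforms, and their probabilities are assumptions and do not reflect how the Deep Learning Tool performs augmentation internally.

```python
import torch

def augment_pair(image: torch.Tensor, label: torch.Tensor):
    """Apply the same random flips/rotations to an image and its label mask.

    image: (C, H, W) float tensor; label: (H, W) integer mask.
    Hypothetical helper, for illustration only.
    """
    if torch.rand(1) < 0.5:                                   # random horizontal flip
        image, label = torch.flip(image, dims=[-1]), torch.flip(label, dims=[-1])
    if torch.rand(1) < 0.5:                                   # random vertical flip
        image, label = torch.flip(image, dims=[-2]), torch.flip(label, dims=[-2])
    k = int(torch.randint(0, 4, (1,)))                        # random 90-degree rotation
    image = torch.rot90(image, k, dims=(-2, -1))
    label = torch.rot90(label, k, dims=(-2, -1))
    # Mild intensity jitter is applied to the image only, never to the label mask.
    image = image * (1.0 + 0.05 * torch.randn(1)) + 0.02 * torch.randn(1)
    return image, label

# Usage: expand a small labeled set by generating several augmented copies of each pair.
image = torch.rand(1, 64, 64)
label = (image[0] > 0.5).long()
augmented_pairs = [augment_pair(image, label) for _ in range(8)]
```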
Data Preparation
Data preparation is the process of selecting the right data to build a training set from your original data and making that data suitable for model training. Your original data may require a number of pre-processing steps to transform the raw data before training and test sets can be extracted. Note that every project is unique, and data requirements need to be evaluated for the model being built. As a general rule, the more complicated the task, the more data is needed.
Pre-processing steps can include the following:
- Calibrating the intensity scale of images to a set of calibration standards to train a 'universal' model. By setting up a standard calibration, the intensity values of different materials or tissues will be consistent, regardless of the sample acquired (see Intensity Scale Calibration and the sketch after this list).
- Applying image filters to enhance images (see Image Filtering).
- Cropping images to create limited-sized datasets (see Cropping Datasets).
- Labeling multi-ROIs, which are used as the target output for models trained for semantic segmentation (see Labeling Multi-ROIs for Deep Learning).
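To make the idea of intensity scale calibration concrete, the sketch below applies a simple two-point linear rescaling so that two reference materials (for example, air and water in a CT scan) land on the same target values in every dataset. The function name, reference materials, and target values are illustrative assumptions; the Intensity Scale Calibration feature handles this for you.

```python
import numpy as np

def calibrate_intensities(image, measured_low, measured_high,
                          target_low=0.0, target_high=1.0):
    """Linearly rescale image intensities so that two reference materials
    land on fixed target values.

    measured_low / measured_high: mean intensities of the two reference
    regions in this particular image. The target values are illustrative;
    use whatever calibration standard your datasets share.
    """
    scale = (target_high - target_low) / (measured_high - measured_low)
    return (image - measured_low) * scale + target_low

# Usage: every dataset is mapped onto the same intensity scale before training,
# so a model trained on one sample can transfer to others (a 'universal' model).
image = np.random.normal(loc=1200.0, scale=50.0, size=(64, 64))
air_mean, water_mean = 80.0, 1100.0      # measured from reference regions (hypothetical values)
calibrated = calibrate_intensities(image, air_mean, water_mean)
```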
Information about training deep models for tasks such as segmentation, denoising, and super-resolution is available in the following topics:
